24 research outputs found

    SANA NetGO: A combinatorial approach to using Gene Ontology (GO) terms to score network alignments

    Gene Ontology (GO) terms are frequently used to score alignments between protein-protein interaction (PPI) networks. Methods exist to measure the GO similarity between two proteins in isolation, but pairs of proteins in a network alignment are not isolated: each pairing is implicitly dependent upon every other pairing via the alignment itself. Current methods fail to take into account the frequency of GO terms across the networks, and attempt to account for common GO terms in an ad hoc fashion by imposing arbitrary rules on when to "allow" GO terms based on their location in the GO hierarchy, rather than using readily available frequency information in the PPI networks themselves. Here we develop a new measure, NetGO, that naturally weighs infrequent, informative GO terms more heavily than frequent, less informative GO terms, without requiring arbitrary cutoffs. In particular, NetGO down-weights the score of frequent GO terms according to their frequency in the networks being aligned. This is a global measure applicable only to alignments, independent of pairwise GO measures, in the same sense that the edge-based EC or S3 scores are global measures of topological similarity independent of pairwise topological similarities. We demonstrate the superiority of NetGO by creating alignments of predetermined quality based on homologous pairs of nodes and show that NetGO correlates with alignment quality much better than any existing GO-based alignment measures. We also demonstrate that NetGO provides a measure of taxonomic similarity between species, consistent with existing taxonomic measures, a feature not shared with existing GO-based network alignment measures. Finally, we re-score alignments produced by almost a dozen aligners from a previous study and show that NetGO does a better job than existing measures at separating good alignments from bad ones.
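
The idea of down-weighting frequent GO terms can be illustrated with a small sketch. This is not the NetGO formula, which the abstract does not state; it is a simple inverse-frequency weighting chosen here for illustration. The function names, the term labels (`termA`, `termB`), and the protein names are all hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def term_frequencies(go1: Dict[str, Set[str]], go2: Dict[str, Set[str]]) -> Counter:
    """Count how many proteins, across both networks, carry each GO term."""
    freq: Counter = Counter()
    for annotations in (go1, go2):
        for terms in annotations.values():
            freq.update(terms)
    return freq


def frequency_weighted_score(alignment: Dict[str, str],
                             go1: Dict[str, Set[str]],
                             go2: Dict[str, Set[str]]) -> float:
    """Sum 1/frequency over the GO terms shared by each aligned pair.

    Illustrative assumption: rare terms contribute more than common ones.
    This is NOT the published NetGO definition.
    """
    freq = term_frequencies(go1, go2)
    total = 0.0
    for u, v in alignment.items():
        shared = go1.get(u, set()) & go2.get(v, set())
        total += sum(1.0 / freq[g] for g in shared)
    return total


# Toy annotations: termA is common (4 proteins), termB is rarer (2 proteins).
go1 = {"a1": {"termA", "termB"}, "a2": {"termA"}}
go2 = {"b1": {"termA", "termB"}, "b2": {"termA"}}

good = {"a1": "b1", "a2": "b2"}     # pairs the two proteins sharing the rare term
swapped = {"a1": "b2", "a2": "b1"}  # only the common term is ever shared

good_score = frequency_weighted_score(good, go1, go2)
swapped_score = frequency_weighted_score(swapped, go1, go2)
```

In this toy case the alignment that matches the rare term scores higher than the one that only matches the common term, which is the qualitative behavior the abstract describes.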

    Defining Equitable Geographic Districts in Road Networks via Stable Matching

    We introduce a novel method for defining geographic districts in road networks using stable matching. In this approach, each geographic district is defined in terms of a center, which identifies a location of interest, such as a post office or polling place, and all other network vertices must be labeled with the center to which they are associated. We focus on defining geographic districts that are equitable, in that every district has the same number of vertices and the assignment is stable in terms of geographic distance. That is, there is no unassigned vertex-center pair such that both would prefer each other over their current assignments. We solve this problem using a version of the classic stable matching problem, called symmetric stable matching, in which the preferences of the elements in both sets obey a certain symmetry. In our case, we study a graph-based version of stable matching in which nodes are stably matched to a subset of nodes denoted as centers, prioritized by their shortest-path distances, so that each center is apportioned a certain number of nodes. We show that, for a planar graph or road network with $n$ nodes and $k$ centers, the problem can be solved in $O(n\sqrt{n}\log n)$ time, which improves upon the $O(nk)$ runtime of using the classic Gale-Shapley stable matching algorithm when $k$ is large. Finally, we provide experimental results on road networks for these algorithms and a heuristic algorithm that performs better than the Gale-Shapley algorithm for any range of values of $k$.
    Comment: 9 pages, 4 figures, to appear in 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2017) November 7-10, 2017, Redondo Beach, California, US
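
When both sides rank each other by the same distance, a stable assignment can be found by a simple greedy pass over all vertex-center pairs in increasing order of distance. The sketch below shows that greedy idea on a tiny precomputed distance table. It is not the paper's $O(n\sqrt{n}\log n)$ algorithm, and the function names and the toy distances are assumptions made for illustration.

```python
from typing import Dict, List


def stable_districts(dist: List[List[float]], quota: int) -> Dict[int, int]:
    """Greedy symmetric stable matching.

    dist[v][c] is the (assumed precomputed) shortest-path distance from
    vertex v to center c. Each center accepts at most `quota` vertices.
    """
    n, k = len(dist), len(dist[0])
    # Consider every vertex-center pair, closest pairs first.
    pairs = sorted((dist[v][c], v, c) for v in range(n) for c in range(k))
    remaining = [quota] * k
    assignment: Dict[int, int] = {}
    for _, v, c in pairs:
        if v not in assignment and remaining[c] > 0:
            assignment[v] = c
            remaining[c] -= 1
    return assignment


def is_stable(dist: List[List[float]], assignment: Dict[int, int], quota: int) -> bool:
    """Return False if some vertex and center would both prefer each other."""
    members: Dict[int, List[int]] = {}
    for v, c in assignment.items():
        members.setdefault(c, []).append(v)
    for v, current in assignment.items():
        for c in range(len(dist[0])):
            if dist[v][c] >= dist[v][current]:
                continue  # v does not prefer c
            assigned = members.get(c, [])
            farthest = max((dist[u][c] for u in assigned), default=float("inf"))
            if len(assigned) < quota or dist[v][c] < farthest:
                return False  # (v, c) is a blocking pair
    return True


# Toy instance: 4 vertices, 2 centers, 2 vertices per district.
toy_dist = [[1, 5], [2, 6], [3, 4], [7, 8]]
districts = stable_districts(toy_dist, quota=2)
```

Sorting all $nk$ pairs costs $O(nk \log(nk))$ time, so this sketch is only meant to make the stability condition concrete, not to be competitive with the bounds quoted in the abstract.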

    Stable-Matching Voronoi Diagrams: Combinatorial Complexity and Algorithms

    We study algorithms and combinatorial complexity bounds for stable-matching Voronoi diagrams, where a set, $S$, of $n$ point sites in the plane determines a stable matching between the points in $\mathbb{R}^2$ and the sites in $S$ such that (i) the points prefer sites closer to them and sites prefer points closer to them, and (ii) each site has a quota indicating the area of the set of points that can be matched to it. Thus, a stable-matching Voronoi diagram is a solution to the classic post office problem with the added (realistic) constraint that each post office has a limit on the size of its jurisdiction. Previous work provided existence and uniqueness proofs, but did not analyze its combinatorial or algorithmic complexity. We show that a stable-matching Voronoi diagram of $n$ sites has $O(n^{2+\varepsilon})$ faces and edges, for any $\varepsilon > 0$, and show that this bound is almost tight by giving a family of diagrams with $\Theta(n^2)$ faces and edges. We also provide a discrete algorithm for constructing it in $O(n^3 + n^2 f(n))$ time, where $f(n)$ is the runtime of a geometric primitive that can be performed in the real-RAM model or can be approximated numerically. This is necessary, as the diagram cannot be computed exactly in an algebraic model of computation.

    SANA: simulated annealing far outperforms many other search algorithms for biological network alignment

    Summary: Every alignment algorithm consists of two orthogonal components: an objective function M measuring the quality of an alignment, and a search algorithm that explores the space of alignments looking for ones scoring well according to M. We introduce a new search algorithm called SANA (Simulated Annealing Network Aligner) and apply it to protein-protein interaction networks using S3 as the topological measure. Compared against 12 recent algorithms, SANA produces 5-10 times as many correct node pairings as the others when the correct answer is known. We expose an anti-correlation in many existing aligners between their ability to produce good topological vs. functional similarity scores, whereas SANA usually outscores other methods in both measures. If given the perfect objective function encoding the identity mapping, SANA quickly converges to the perfect solution while many other algorithms falter. We observe that when aligning networks with a known mapping and optimizing only S3, SANA creates alignments that are not perfect and yet whose S3 scores match that of the perfect alignment. We call this phenomenon saturation of the topological score. Saturation implies that a measure's correlation with alignment correctness falters before the perfect alignment is reached. This, combined with SANA's ability to produce the perfect alignment if given the perfect objective function, suggests that better objective functions may lead to dramatically better alignments. We conclude that future work should focus on finding better objective functions, and offer SANA as the search algorithm of choice.
    Availability and implementation: Software available at http://sana.ics.uci.edu
    Contact: [email protected]
    Supplementary information: Supplementary data are available at Bioinformatics online.
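
The split between objective function and search can be made concrete with a generic simulated annealing loop. The sketch below is not SANA's implementation: the swap move, the geometric cooling schedule, the parameter values, and the function names are assumptions for illustration. The objective used here is the "identity mapping" case mentioned in the abstract, counting how many nodes are mapped to themselves.

```python
import math
import random
from typing import List, Tuple


def score(alignment: List[int]) -> int:
    """Objective M: number of nodes mapped to themselves (identity mapping)."""
    return sum(1 for i, j in enumerate(alignment) if i == j)


def anneal(n: int, steps: int = 20000, t_start: float = 1.0,
           t_end: float = 0.001, seed: int = 0) -> Tuple[List[int], int]:
    """Search over permutations of range(n) (n >= 2) by simulated annealing."""
    rng = random.Random(seed)
    current = list(range(n))
    rng.shuffle(current)
    cur_score = score(current)
    best, best_score = current[:], cur_score
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        a, b = rng.sample(range(n), 2)
        current[a], current[b] = current[b], current[a]  # propose a swap
        new_score = score(current)
        delta = new_score - cur_score
        # Always accept improvements; accept worse moves with probability exp(delta / t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            cur_score = new_score
            if cur_score > best_score:
                best, best_score = current[:], cur_score
        else:
            current[a], current[b] = current[b], current[a]  # undo the swap
    return best, best_score


best_alignment, best_value = anneal(8)
```

Because `score` is the only place the objective appears, a different measure could be substituted without touching the search loop, which mirrors the orthogonality the abstract describes.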

    Automatic evaluation of top-down predictive parsing

    We develop efficient methods to check whether two given Context-Free Grammars (CFGs) are transformed into parsers that recognize the same language and construct the same Abstract Syntax Trees (ASTs) for each input. In this setting, we consider a model of top-down predictive parser generator with directives for AST construction that is a simplified variant of PCCTS/ANTLR3. As an application, we implement an evaluator for an online judge for educational purposes in the context of a Compilers course. Preprint